Machine learning overcomes human bias in the discovery of self-assembling peptides - Nature Chemistry

Peptide materials have a wide array of functions, from tissue engineering and surface coatings to catalysis and sensing. Tuning the sequence of amino acids that make up the peptide modulates peptide functionality, but a small increase in sequence length leads to a dramatic increase in the number of peptide candidates. Traditionally, peptide design is guided by human expertise and intuition and typically yields fewer than ten peptides per study, but these approaches are not easily scalable and are susceptible to human bias. Here we introduce a machine learning workflow, AI-expert, that combines Monte Carlo tree search and random forest with molecular dynamics simulations to develop a fully autonomous computational search engine for discovering peptide sequences with high potential for self-assembly. We demonstrate that AI-expert can efficiently search large spaces of tripeptides and pentapeptides. The predictive performance of AI-expert is on par with or better than that of our human experts, and it suggests several non-intuitive sequences with high self-assembly propensity, outlining its potential to overcome human bias and accelerate peptide discovery.

Peptide design remains a challenge owing to the large library of amino acids. Rational design approaches, although successful, result in a peptide design bias. Now it has been shown that AI techniques can be used to overcome such bias and discover unusual peptides as efficiently as humans.
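To make the growth in candidate count concrete, the following is a minimal sketch, not taken from the paper, that counts and lazily enumerates linear peptide sequences. It assumes the library is restricted to the 20 canonical amino acids; the names `AMINO_ACIDS`, `sequence_space_size` and `enumerate_peptides` are hypothetical and used only for illustration.

```python
from itertools import product

# Assumption: only the 20 canonical amino acids, in one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sequence_space_size(length: int, alphabet: str = AMINO_ACIDS) -> int:
    """Return the number of distinct linear sequences of the given length."""
    return len(alphabet) ** length


def enumerate_peptides(length: int, alphabet: str = AMINO_ACIDS):
    """Lazily yield every sequence of the given length as a string."""
    for residues in product(alphabet, repeat=length):
        yield "".join(residues)


if __name__ == "__main__":
    # Compare the tripeptide and pentapeptide search spaces.
    for n in (3, 5):
        print(f"length {n}: {sequence_space_size(n)} candidate sequences")
```

Under this assumption there are 20^3 = 8,000 tripeptides but 20^5 = 3,200,000 pentapeptides, so exhaustive simulation quickly becomes impractical as sequences grow longer, which is the motivation for a guided search.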


Best practices in machine learning for chemistry - Nature Chemistry

The application of statistical machine learning techniques in chemistry has a long history [1]. Algorithmic innovation, improved data availability, and increases in computer power have led to an unprecedented growth in the field [2,3]. Extending the previous generation of high-throughput methods, and building on the many extensive and curated databases available, the ability to map between the chemical structure of molecules and materials and their physical properties has been widely demonstrated using supervised learning for both regression (for example, reaction rate) and classification (for example, reaction outcome) problems. Notably, molecular modelling has benefited from interatomic potentials based on Gaussian processes [4] and artificial neural networks [5] that can reproduce structural transformations at a fraction of the cost required by standard first-principles simulation techniques. The research literature itself has become a valuable resource for mining latent knowledge using natural language processing, as recently applied to extract synthesis recipes for inorganic crystals [6].